Exploiting hidden information interleaved in the redundancy of the genetic code without prior knowledge
Authors
Abstract
MOTIVATION Dozens of studies in recent years have demonstrated that codon usage encodes signals related to all stages of gene expression regulation. When relevant high-quality, large-scale gene expression data are available, these signals can be statistically inferred and modelled, which enables the analysis and engineering of gene expression. When such data are not available, however, these models can be neither inferred nor validated.
RESULTS In this study, we propose Chimera, an unsupervised, computationally efficient approach for exploiting hidden high-dimensional information about the way gene expression is encoded in the open reading frame (ORF), based solely on the genome of the analysed organism. One version of the approach, Chimera Average Repetitive Substring (ChimeraARS), estimates the adaptability of an ORF to the intracellular gene expression machinery of a genome (the host) by computing its tendency to include long substrings that appear in the host's coding sequences. The second version, ChimeraMap, engineers the codons of a protein so that it includes long substrings of codons that appear in the host coding sequences, improving its adaptation to a new host's gene expression machinery. We demonstrate the applicability of the new approach for analysing and engineering heterologous genes and for analysing endogenous genes. Specifically, focusing on Escherichia coli, we show that it can exploit information that cannot be detected by conventional approaches (e.g. the Codon Adaptation Index, CAI), which consider only single-codon distributions. For example, we report correlations of up to 0.67 between the ChimeraARS measure and heterologous gene expression in cases where the CAI yielded no correlation.
AVAILABILITY AND IMPLEMENTATION For non-commercial purposes, the code of the Chimera approach can be downloaded from http://www.cs.tau.ac.il/∼tamirtul/Chimera/download.htm.
CONTACT [email protected]
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
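The ChimeraARS idea described in the abstract (scoring an ORF by its tendency to contain long substrings that also occur in the host coding sequences) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that matching is done on in-frame codons, that the score is the mean, over all start positions, of the longest match length, and that a naive search is acceptable for small inputs. The function names `to_codons`, `longest_match` and `chimera_ars_sketch` are hypothetical.

```python
def to_codons(seq):
    """Split a nucleotide string into in-frame codons, dropping any trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def longest_match(codons, start, host_texts):
    """Length (in codons) of the longest run beginning at `start` found in any host ORF.

    Host ORFs are stored as comma-delimited codon strings, so a plain substring
    test only matches runs that are aligned to codon boundaries.
    """
    best = 0
    for length in range(1, len(codons) - start + 1):
        query = "," + ",".join(codons[start:start + length]) + ","
        if any(query in text for text in host_texts):
            best = length
        else:
            break  # a longer run cannot match if its prefix does not
    return best


def chimera_ars_sketch(orf, host_orfs):
    """Assumed ARS-style score: mean longest-match length over all codon positions."""
    codons = to_codons(orf)
    if not codons:
        return 0.0
    host_texts = ["," + ",".join(to_codons(h)) + "," for h in host_orfs]
    lengths = [longest_match(codons, i, host_texts) for i in range(len(codons))]
    return sum(lengths) / len(codons)


if __name__ == "__main__":
    host = ["ATGGCTAAA", "ATGGGTTAA"]  # toy host coding sequences
    print(chimera_ars_sketch("ATGGCTTAA", host))
```

A higher score means the ORF shares longer codon runs with the host. For genome-scale inputs the naive search would be replaced by an indexed structure such as a suffix array or suffix tree; that choice is an assumption of this sketch and is not stated in the abstract.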
Journal:
- Bioinformatics
Volume 31, Issue 8
Pages: -
Publication date: 2015